What is claimed is: 

1. A method for transposing data in a plurality of processing elements, comprising:
a plurality of shifting operations; and 

a plurality of storing operations, said shifting and storing operations coordinated to 
enable data to be stored along a diagonal of processing elements from at least one first direction 
and to be output from said diagonal in at least one second direction perpendicular to said first 
direction, and wherein said plurality of storing operations is responsive to the processing 
element's position. 

2. The method of claim 1 wherein said plurality of storing operations are responsive to 
initial counts which are one of loaded into at least certain of said processing elements and
calculated locally based on the processing element's location. 

3. The method of claim 2 additionally comprising maintaining a current count in each 
processing element for each initial count, said current counts being responsive to said initial 
counts and the number of data shifts performed. 

4. The method of claim 3 wherein said maintaining current counts includes altering said 
initial counts at programmable intervals by a programmable amount.

5. The method of claim 4 wherein said initial counts are decremented in response to a 
shifting of data to produce said current counts. 

6. The method of claim 5 wherein a storing operation is performed when a current count in 
a processing element is nonpositive. 
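
For illustration only, the count mechanism recited in claims 2 to 6 might be sketched as follows. The class name `ProcessingElement`, the helper `initial_count`, the rule "twice the distance from the diagonal", and the latch that limits each element to a single store are all assumptions introduced for this sketch and are not taken from the claims.

```python
def initial_count(row, col):
    """Hypothetical local rule: twice the distance from the diagonal."""
    return 2 * abs(col - row)


class ProcessingElement:
    """One processing element that decides when to store from its own count."""

    def __init__(self, row, col, interval=1, amount=1):
        # Initial count calculated locally from the element's position (claim 2).
        self.current_count = initial_count(row, col)
        self.interval = interval   # programmable interval (claim 4)
        self.amount = amount       # programmable amount (claim 4)
        self.shifts = 0
        self.stored = None
        self.has_stored = False    # assumption: store only once per pass

    def on_shift(self, incoming):
        """Called once per data shift with the word arriving at this element."""
        self.shifts += 1
        # Alter the count every `interval` shifts by `amount` (claims 4 and 5).
        if self.shifts % self.interval == 0:
            self.current_count -= self.amount
        # Store when the current count is nonpositive (claim 6).
        if self.current_count <= 0 and not self.has_stored:
            self.stored = incoming
            self.has_stored = True


pe = ProcessingElement(row=0, col=2)      # initial count is 4
for word in ["a", "b", "c", "d", "e"]:
    pe.on_shift(word)
print(pe.stored)                          # the word seen on the fourth shift
```

In this sketch the element needs no global controller to tell it which word to keep; its position alone fixes the shift on which it stores.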

7. The method of claim 1 wherein the first and second directions are selected from among 
the x, z and y directions.

8. A method for transposing data in a plurality of processing elements, comprising:
a first plurality of shifting and storing operations coordinated to enable data to be
collected from along a first direction and stored along a second direction perpendicular to said
first direction; and

a second plurality of shifting and storing operations coordinated to enable data to be
collected from along a third direction opposite to said first direction and stored along a fourth 
direction opposite to said second direction, and wherein said first and second plurality of storing 
operations is responsive to the processing element's position.

9. The method of claim 8 wherein said first and second plurality of storing operations are
responsive to initial counts which are one of loaded into at least certain of said processing 
elements and calculated locally based on the processing element's location. 

10. The method of claim 9 additionally comprising maintaining a current count in each 
processing element for each initial count, said current counts being responsive to said initial 
counts and the number of data shifts performed. 

11. The method of claim 10 wherein said maintaining current counts includes altering said 
initial counts at programmable intervals by a programmable amount. 

12. The method of claim 11 wherein said initial counts are decremented in response to a
shifting of data to produce said current counts. 

13. The method of claim 12 wherein a processing element performs a storing operation when 
its current count is nonpositive. 

14. The method of claim 8 wherein the plurality of processing elements is arranged in an 
array, and wherein said first and second directions and the third and fourth directions are selected 
from among the dimensions of the array including the +x/-x, +z/-z and +y/-y pairs of directions.

15. A method for transposing data in a plurality of processing elements, comprising: 
a first shifting of data in a first direction; 

a first storing of data along a diagonal of processing elements in response to said first 
shifting; 

a second shifting of data from said diagonal so as to output the stored data from said
diagonal, said second shifting being in a second direction perpendicular to said first direction; 

a second storing of data by at least certain of said processing elements in response to said 
second shifting and in response to the processing element's position; 

a third shifting of data in a third direction opposite to said first direction; 

a third storing of data along said diagonal in response to said third shifting; 

a fourth shifting of data from said diagonal so as to output the data stored in response to
said third shifting, said fourth shifting of data being in a fourth direction opposite to said second 
direction; and 

a fourth storing of data by at least certain other of said processing elements in response to 
said fourth shifting and in response to the processing element's position.

16. The method of claim 15 wherein said first, second, third and fourth storing are each
responsive to initial counts which are one of loaded into at least certain of said processing 
elements and calculated locally based on the processing element's location. 

17. The method of claim 16 additionally comprising maintaining a current count in each 
processing element for each initial count, said current counts being responsive to said initial
counts and the number of data shifts performed.

18. The method of claim 17 wherein said maintaining current counts includes altering said 
initial counts at programmable intervals by a programmable amount. 

19. The method of claim 18 wherein said initial counts are decremented in response to a 
shifting of data to produce said current counts. 

20. The method of claim 19 wherein said first, second, third and fourth storing are each 
responsive to current counts. 

21. The method of claim 15 wherein the plurality of processing elements is arranged in an
array and wherein the first and second directions and the third and fourth directions are selected
from among the dimensions of the array including the +x/-x, +z/-z and +y/-y pairs of directions.
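
For illustration only, the four shifting and four storing steps of claim 15 might be simulated on a square two-dimensional array as sketched below. The function names `half_pass` and `transpose_via_diagonal`, the choice of east/north for the first and second directions, the position rule `2 * distance from the diagonal` for the initial counts, and the retiring of a count once its element has stored are assumptions made for this sketch, not details taken from the claims.

```python
def half_pass(a, result, sign):
    """One horizontal/vertical pass through the diagonal.

    sign=+1: shift east, store on the diagonal, shift north (fills c > r).
    sign=-1: shift west, store on the diagonal, shift south (fills c < r).
    """
    n = len(a)
    horiz = [row[:] for row in a]                # words moving along the rows
    vert = [[None] * n for _ in range(n)]        # words moving along the columns
    # Initial counts calculated locally from each element's position (assumed rule).
    count = {(r, c): 2 * sign * (c - r)
             for r in range(n) for c in range(n) if sign * (c - r) > 0}
    for _ in range(2 * (n - 1)):
        # Second (or fourth) shifting: move column data one step away from the diagonal.
        vert = [[vert[r + sign][c] if 0 <= r + sign < n else None
                 for c in range(n)] for r in range(n)]
        # First (or third) shifting: move row data one step toward the diagonal.
        horiz = [[horiz[r][c - sign] if 0 <= c - sign < n else None
                  for c in range(n)] for r in range(n)]
        # First (or third) storing: each diagonal element captures the arriving word.
        for d in range(n):
            vert[d][d] = horiz[d][d]
        # Second (or fourth) storing: an element stores when its count is nonpositive.
        for pos in list(count):
            count[pos] -= 1
            if count[pos] <= 0:
                r, c = pos
                result[r][c] = vert[r][c]
                del count[pos]                   # assumption: store once, then retire


def transpose_via_diagonal(a):
    """Transpose a square matrix held one word per processing element."""
    result = [row[:] for row in a]               # diagonal words stay in place
    half_pass(a, result, +1)                     # first direction and its perpendicular
    half_pass(a, result, -1)                     # the two opposite directions
    return result


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(transpose_via_diagonal(matrix))
```

Each word travels along its row to the diagonal and then along the diagonal element's column, so an element at distance d from the diagonal sees its word on shift 2d, which is why the assumed initial count is twice that distance.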

22. A method for transposing data in a plurality of processing elements, comprising: 

a plurality of shifting and storing operations coordinated to enable data to be collected 
from along a first direction and stored along a second direction perpendicular to said first
direction, and wherein said plurality of storing operations is responsive to the processing 
element's position. 

23. The method of claim 22 wherein said plurality of storing operations are responsive to 
initial counts which are one of loaded into at least certain of said processing elements and 
calculated locally based on the processing element's location. 

24. The method of claim 23 additionally comprising maintaining a current count in each 
processing element for each initial count, said current counts being responsive to said initial 
counts and the number of data shifts performed. 

25. The method of claim 24 wherein said maintaining current counts includes altering said 
initial counts at programmable intervals by a programmable amount. 

26. The method of claim 25 wherein said initial counts are decremented in response to a 
shifting of data to produce said current counts.

27. The method of claim 26 wherein a processing element performs a storing operation when
its current count is nonpositive. 

28. The method of claim 22 wherein the plurality of processing elements is arranged in an 
array, and wherein the first and second directions are selected from among the dimensions of the
array including the x, z and y directions. 

29. A method for transposing data in a plurality of processing elements, comprising: 
a first shifting of data in a first direction; 

a first storing of data along a diagonal of processing elements in response to said first 
shifting; 

a second shifting of data from said diagonal so as to output the stored data from said
diagonal, said second shifting being in a second direction perpendicular to said first direction; 
and 

a second storing of data by at least certain of said processing elements in response to said 
second shifting and in response to the processing element's position. 

30. The method of claim 29 wherein said first and second storing are each responsive to 
initial counts which are one of loaded into at least certain of said processing elements and 
calculated locally based on the processing element's location. 

31. The method of claim 30 additionally comprising maintaining a current count in each
processing element for each initial count, said current counts being responsive to said initial 
counts and the number of data shifts performed. 

32. The method of claim 31 wherein said maintaining current counts includes altering said 
initial counts at programmable intervals by a programmable amount. 

33. The method of claim 32 wherein said initial counts are decremented in response to a 
shifting of data to produce said current counts. 

34. The method of claim 33 wherein said first and second storing are each responsive to 
current counts. 

35. The method of claim 29 wherein the plurality of processing elements is arranged in an
array and wherein the first and second directions are selected from among the dimensions of the
array including the x, z and y directions. 

36. A memory device carrying a set of instructions which, when executed, perform a method 
comprising: 
a plurality of shifting operations; and

a plurality of storing operations, said shifting and storing operations coordinated to 
enable data to be stored along a diagonal of processing elements from at least one first direction 
and to be output from said diagonal in at least one second direction perpendicular to said first 
direction, and wherein said plurality of storing operations is responsive to the processing 
element's position. 